Abstract



Given strings P and Q the (exact) string matching problem is to find all positions of substrings in
Q matching P. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string
matching problem in linear time, which is optimal if we can only read one character at a time. However,
most strings are stored in a computer in a packed representation with several characters in a single word,
giving us the opportunity to read multiple characters simultaneously. In this paper we study the
worst-case complexity of string matching on strings given in packed representation. Let $m \le n$ be the
lengths of P and Q, respectively, and let $\sigma$ denote the size of the alphabet. On a standard unit-cost
word-RAM with logarithmic word size we present an algorithm using time

$$O\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right).$$

Here occ is the number of occurrences of P in Q. For $m = o(n)$ this improves the $O(n)$ bound of the
Knuth-Morris-Pratt algorithm. Furthermore, if $m = O(n/\log_\sigma n)$ our algorithm is optimal since any
algorithm must spend at least $\Omega\left(\frac{n \log \sigma}{\log n} + \mathrm{occ}\right) = \Omega\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right)$ time to read the input and report all
occurrences. The result is obtained by a novel automaton construction based on the Knuth-Morris-Pratt
algorithm combined with a new compact representation of subautomata allowing an optimal
tabulation-based simulation.

1 Introduction 

Given strings P and Q of length m and n, respectively, the (exact) string matching problem is to report all
positions of substrings in Q matching P. The string matching problem is perhaps the most basic problem
in combinatorial pattern matching and also one of the most well-studied, see e.g. [6, 9, 15, 17] for classical
textbook algorithms and the surveys in [14, 20]. The first worst-case $O(n)$ algorithm (we assume w.l.o.g.
that $m \le n$) is the classical Knuth-Morris-Pratt algorithm [17]. If we assume that we can read only one
character at a time this bound is optimal since we need $\Omega(n)$ time to read the input. However, most strings
are stored in a computer in a packed representation with several characters in a single word. For instance,
DNA-sequences have an alphabet of size 4 and are therefore typically stored using 2 bits per character with 32
characters in a 64-bit word. On packed strings we can read multiple characters in constant time and hence
potentially do better than the $\Omega(n)$ lower bound for string matching. In this paper we study the worst-case
complexity of packed string matching and present an algorithm that beats the $O(n)$ bound for almost all
combinations of m and n.
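
As a rough illustration of the packed representation in the DNA example, the following Python sketch stores 2 bits per character and 32 characters per 64-bit word. The mapping `CODE` and the helper names `pack` and `char_at` are assumptions made for this example only and do not come from the paper.

```python
# Hypothetical 2-bit code for the DNA alphabet; any fixed assignment works.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS = 2                  # bits per character for an alphabet of size 4
PER_WORD = 64 // BITS     # 32 characters fit in one 64-bit word


def pack(s):
    """Pack a DNA string into a list of 64-bit words, 32 characters each."""
    words = []
    for start in range(0, len(s), PER_WORD):
        word = 0
        for offset, ch in enumerate(s[start:start + PER_WORD]):
            word |= CODE[ch] << (BITS * offset)
        words.append(word)
    return words


def char_at(words, i):
    """Read character i (0-indexed) back out of the packed words."""
    word = words[i // PER_WORD]
    return "ACGT"[(word >> (BITS * (i % PER_WORD))) & 0b11]


text = "ACGT" * 20        # 80 characters
packed = pack(text)       # occupies ceil(80 / 32) words
```

A single word read therefore delivers up to 32 characters at once, which is the source of the potential speed-up studied in this paper.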

*An extended abstract of this paper appeared at the 20th Annual Symposium on Combinatorial Pattern Matching.
† Supported by the Danish Agency for Science, Technology, and Innovation.
1.1 Setup and Results



We assume a standard unit-cost word RAM with word length $w = \Theta(\log n)$ and a standard instruction set
including arithmetic operations, bitwise boolean operations, and shifts. The space complexity is the number
of words used by the algorithm, not counting the input which is assumed to be read-only. All strings in
this paper are over an alphabet $\Sigma$ of size $\sigma$. The packed representation of a string S is obtained by storing
$\Theta(\log n/\log \sigma)$ characters per word thus representing S in $O(|S| \log \sigma/\log n) = O(|S|/\log_\sigma n)$ words. If S
is given in the packed representation we simply say that S is a packed string. The packed string matching
problem is defined as above except that P and Q are packed strings. In the worst case any algorithm for
packed string matching must examine all of the words in the packed representation of the input strings. The
algorithm must also report all occurrences of P in Q and therefore must spend at least

$$\Omega\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right)$$

time, where occ denotes the number of occurrences of P in Q. In this paper we present an algorithm with
the following complexity.

Theorem 1 For packed strings P and Q of length m and n, respectively, with characters from an alphabet of
size $\sigma$, we can solve the packed string matching problem in time $O\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right)$ and space $O(n^\epsilon + m)$
for any constant $\epsilon$, $0 < \epsilon < 1$.

For $m = o(n)$ this improves the $O(n)$ bound of the Knuth-Morris-Pratt algorithm. Furthermore, if
$m = O(n/\log_\sigma n)$ our algorithm matches the lower bound and is therefore optimal. In practical situations m is
typically much smaller than n and therefore this condition is almost always satisfied.

1.2 Techniques 

The KMP-algorithm [17] may be viewed as simulating an automaton K according to the characters from Q
in a left-to-right order. At each character in Q we use K to maintain the longest prefix of P matching the
current suffix of Q. Improvements of automaton-based algorithms can often be obtained by partitioning the
automaton into many small subautomata, tabulating relevant information for the subautomata, and using the
tables to speed up the simulation in each subautomaton [18, 19, 24]. This idea is also known as the "Four
Russian Technique" after Arlazarov et al. [5].

However, if we attempt to apply this idea to the KMP-algorithm two major problems appear. First,
the structure of the transitions in K does not in general allow us to partition K into subautomata such
that a simulation does not change subautomata too often. Indeed, for any partition we might be forced to
repeatedly change subautomaton after every group of $O(1)$ characters of Q and hence end up using $\Omega(n)$
time. Secondly, even if we could design a suitable partition of K into subautomata we have to compactly
encode the transitions of the subautomata in order for the tabulation to be efficient. An explicit list of such
transitions will not suffice to achieve the bound of Theorem 1. The main contributions of this paper are two
new ideas to overcome these problems.

First, we present the segment automaton, C, derived from K. In C, the states of K are grouped into
overlapping intervals of $r = \Theta(\log n/\log \sigma)$ states from K such that (almost all of) the states in K are
duplicated in C. We show how to selectively "copy" the transitions from K to C such that the total number
of transitions between subautomata never exceeds $O(n/r)$ in the simulation on Q. Secondly, we show how to
exploit structural properties of the transitions to represent subautomata optimally. This allows us to tabulate
paths of transitions for all subautomata of size $\le r$ using $O(\sigma^r + m) = O(n^\epsilon + m)$ space and preprocessing
time for a suitably chosen r. The simulation can then be performed in time $O(n/r + \mathrm{occ}) = O(n/\log_\sigma n + \mathrm{occ})$
leading to Theorem 1.

The main contribution of this paper is theoretical; however, we believe that both the segment automaton
and the compact representation of automata may prove very useful in practice if combined with ideas from
other algorithms for packed matching.



1.3 Related Work



Exploiting packed string representations to speed up string matching is not a new idea and is even mentioned
in the early papers by Knuth et al. and Boyer and Moore [9, 17]. More recently, several packed string matching
algorithms have appeared [8, 10-13, 16, 22]. However, none of these improve the worst-case $O(n)$ bound of
the classical KMP-algorithm.

It is possible to extend the "super-alphabet" technique by Fredriksson [12, 13] to obtain a simple trade-off
for packed string matching. The idea is to build an automaton that, similar to the KMP-automaton,
maintains the longest prefix of P matching the current suffix of Q but allows Q to be processed in groups of
r characters. Each state has $\sigma^r$ outgoing transitions corresponding to all combinations of r characters. This
algorithm uses $O(n/r + m\sigma^r)$ time and $O(m\sigma^r)$ space. Choosing $r = \epsilon \log_\sigma n$ this is $O(n/\log_\sigma n + mn^\epsilon)$
time and $O(mn^\epsilon)$ space. Compared to Theorem 1 this is a factor $O(m)$ worse in space and only improves
the $O(n)$ time bound of the KMP-algorithm when $m = o(n^{1-\epsilon})$.

Packed string matching is closely related to the area of compressed pattern matching introduced by Amir
and Benson [2, 3]. Here the goal is to search for an uncompressed pattern in a compressed text without
decompressing it first. Furthermore, the search should be faster than the naive approach of decompressing
the text first and then using the fastest algorithm for the uncompressed problem. In fully compressed
pattern matching the pattern is also given in compressed form. Several algorithms for (fully) compressed
string matching are known, see e.g., the survey by Rytter [21]. For instance, if Q is compressed with the
Ziv-Lempel-Welch scheme [23] into a string Z of length z, Amir et al. [4] showed how to find all occurrences
of P in time $O(m^2 + z)$. The packed representation of a string may be viewed as the most basic way to
compress a string. Hence, in this perspective we are studying the fully compressed string matching problem
for packed strings. Note that our result is optimal if the pattern is not packed.

1.4 Outline 

In Section 2 we first review the KMP-algorithm before presenting the segment automaton in Section 3. In
Section 4 we show how to compactly represent and efficiently tabulate subautomata and in Section 5 we
present the complete algorithm. Finally, in Section 6 we conclude with some remarks and open problems.

2 The Knuth-Morris-Pratt Automaton and String Matching 

In this section we briefly review the KMP-algorithm [17], which will be the starting point of our new algorithm.

Let S be a string of length |S| on an alphabet $\Sigma$. The character at position i in S is denoted S[i] and
the substring from position i to j is denoted by S[i, j]. The substrings S[1, j] and S[i, |S|] are the prefixes
and suffixes of S, respectively.

The Knuth-Morris-Pratt automaton (KMP-automaton), denoted K(P), for P consists of m + 1 states
identified by the integers {0, ..., m}, each corresponding to a prefix of P. From state s to state s + 1,
$0 \le s < m$, there is a forward transition labeled P[s + 1]. We call the rightmost forward transition from m - 1 to
m the accepting transition. From state s, $0 < s \le m$, there is a failure transition to a state denoted fail(s)
such that P[1, fail(s)] is the longest prefix of P matching a proper suffix of P[1, s]. Fig. 1(a) depicts the
KMP-automaton for the pattern P = ababca.

The failure transitions form a tree with root in state 0 and with the property that fail(s) < s for any
state s > 0. Since the longest prefix of P matching a proper suffix of P[1, s + 1] can be at most one
character longer than the longest prefix of P matching a proper suffix of P[1, s], we have the following
property of failure transitions.

Lemma 1 Let P be a string of length m and K(P) be the KMP-automaton for P. For any state $1 \le s < m$,
$\mathrm{fail}(s + 1) \le \mathrm{fail}(s) + 1$.

We will exploit this property in Section 4.1 to compactly encode subautomata of the KMP-automaton. The
KMP-automaton can be constructed in time $O(m)$ [17].
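
The failure transitions and the property of Lemma 1 can be checked on the running example with a short Python sketch. The function name `failure_function` and the sentinel value stored for state 0 are assumptions of this sketch; the construction itself is the standard linear-time one.

```python
def failure_function(P):
    """Return fail[0..m], where fail[s] is the length of the longest prefix
    of P that is a proper suffix of P[1, s] (1-indexed as in the text).
    State 0 has no failure transition; fail[0] = -1 is only a sentinel."""
    m = len(P)
    fail = [-1] + [0] * m
    k = 0                                   # current value of fail[s - 1]
    for s in range(2, m + 1):
        # P[k] (0-indexed) is the label of the forward transition from state k.
        while k > 0 and P[k] != P[s - 1]:
            k = fail[k]
        if P[k] == P[s - 1]:
            k += 1
        fail[s] = k
    return fail


fail = failure_function("ababca")
# Lemma 1: the failure function grows by at most 1 between consecutive states.
lemma1_holds = all(fail[s + 1] <= fail[s] + 1 for s in range(1, len(fail) - 1))
```

For P = ababca the failure pointers of states 1 to 6 are 0, 0, 1, 2, 0, 1.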

To find the occurrences of P in Q we read the characters of Q from left to right while traversing K(P) to
maintain the longest prefix of P matching a suffix of the current prefix of Q as follows. Initially, we set the
state of K(P) to 0. Suppose that we are in state s after reading the first k - 1 characters of Q, i.e., the longest
prefix of P matching a suffix of Q[1, k - 1] is P[1, s]. We process the next character $\alpha = Q[k]$ as follows. If $\alpha$
matches the label of the forward transition from s the next state is s + 1. Furthermore, if this transition is
the accepting transition then k is the endpoint of a substring of Q matching P and we therefore report an
occurrence. Otherwise ($\alpha$ does not match the label of the forward transition from s to s + 1), we recursively
follow failure transitions from s until we find a state s' whose forward transition is labeled $\alpha$, in which case
the next state is s' + 1, or if no such state exists we set the next state to be 0. We define the simulation of
K(P) on Q to be the sequence of transitions traversed by the algorithm.

Figure 1: (a) The Knuth-Morris-Pratt automaton K(P) for the pattern P = ababca. Solid lines are forward
transitions and dashed lines are failure transitions. (b)-(c) The corresponding segment automaton C(P, 4)
for P consisting of 3 segments with 4, 4, and 3 states. The light transitions are shown in (b) and the heavy
transitions in (c).

Each time the simulation on Q follows a forward transition we continue to the next character and hence
the total number of forward transitions is at most n. Each failure transition strictly decreases the current
state number while forward transitions increase the state number by 1. Since we start in state 0 the number
of failure transitions is therefore at most the number of forward transitions. Hence, the total number of
transitions is at most 2n and therefore the searching takes $O(n)$ time. In total the KMP-algorithm uses time
$O(n + m) = O(n)$.
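
The simulation of K(P) on Q and the 2n bound on the number of transitions can be sketched in Python as follows. The names `failure_function` and `kmp_search` are chosen for this sketch, and the strings are ordinary unpacked Python strings; occurrences are reported by their 1-indexed end position.

```python
def failure_function(P):
    """fail[s] = length of the longest prefix of P that is a proper suffix
    of P[1, s]; fail[0] = -1 is an unused sentinel."""
    m = len(P)
    fail = [-1] + [0] * m
    k = 0
    for s in range(2, m + 1):
        while k > 0 and P[k] != P[s - 1]:
            k = fail[k]
        if P[k] == P[s - 1]:
            k += 1
        fail[s] = k
    return fail


def kmp_search(P, Q):
    """Simulate K(P) on Q. Returns the end positions of all occurrences and
    the number of forward and failure transitions that were followed."""
    m = len(P)
    fail = failure_function(P)
    s = 0                       # current state
    ends, transitions = [], 0
    for k in range(1, len(Q) + 1):
        a = Q[k - 1]            # the character Q[k] in 1-indexed notation
        # Follow failure transitions until the forward label matches a.
        while s > 0 and (s == m or P[s] != a):
            s = fail[s]
            transitions += 1
        if s < m and P[s] == a:  # forward transition from s to s + 1
            s += 1
            transitions += 1
            if s == m:           # accepting transition
                ends.append(k)
        # otherwise s == 0 and a is skipped without following a transition
    return ends, transitions


ends, transitions = kmp_search("ababca", "abababcaababca")
```

On this input the two occurrences end at positions 8 and 14, and the transition count stays below 2n = 28.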

3 The Segment Automaton 

In this section we introduce a simple automaton called the segment automaton. The segment automaton for
P is equivalent to K(P) in the sense that the simulation on Q at each step provides the longest prefix of P
matching a suffix of the current prefix of Q. The segment automaton allows us to easily decompose K(P) into
subautomata of a given size r such that the simulation on Q passes through at most $O(n/r)$ subautomata.

Let K = K(P) be the KMP-automaton for P. For an even integer parameter r, 1 < r < m + 1, we define
the segment automaton with parameter r, denoted C(P, r), as follows. Define a segment S to be an interval
$S = [l, r]$, $0 \le l \le r \le m$, of states in K(P) and let $|S| = r - l + 1$ denote the size of S. Divide the m + 1 states
of K into a set of $z = \lceil (m + 1)/r \rceil$ overlapping segments, denoted $\mathcal{SS} = \{S_0, \ldots, S_{z-1}\}$, where $S_i = [l_i, r_i]$ is
defined by

$$l_i = i \cdot \frac{r}{2}, \qquad r_i = \min(l_i + r - 1, m).$$

Thus, each segment in $\mathcal{SS}$ consists of r consecutive states from K, except the last segment, $S_{z-1}$, which may
be smaller. Any state s in K appears in at most 2 segments and adjacent segments share r/2 states.
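
A small Python sketch of this decomposition is given below. The function name `segments` and the stopping rule (stop as soon as a segment reaches state m) are assumptions of the sketch; the endpoints follow the definition of $l_i$ and $r_i$ above.

```python
def segments(m, r):
    """Overlapping segments (l_i, r_i) of the states 0..m for an even r,
    with l_i = i * r/2 and r_i = min(l_i + r - 1, m)."""
    assert r >= 2 and r % 2 == 0
    segs = []
    i = 0
    while not segs or segs[-1][1] < m:   # stop once state m is covered
        l = i * (r // 2)
        segs.append((l, min(l + r - 1, m)))
        i += 1
    return segs


# Pattern of Fig. 1: m = 6 (states 0..6) and r = 4.
fig1_segments = segments(6, 4)
```

For m = 6 and r = 4 this yields the intervals [0, 3], [2, 5], and [4, 6], i.e., the three segments with 4, 4, and 3 states mentioned in the caption of Fig. 1.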

The segment automaton C = C(P, r) is obtained by adding |S| states for each segment $S \in \mathcal{SS}$ and then
selectively "copying" transitions from K to C. Specifically, the states of C are the set of pairs given by

$$\{(i, j) \mid 0 \le i < z,\ 0 \le j < |S_i|\}.$$

We view each state (i, j) of C as the jth state of the ith segment, i.e., state (i, j) corresponds to the state $l_i + j$
in K. Hence, each state in K is represented by 1 or 2 states in C and each state in C uniquely corresponds
to a state in K.

We copy transitions from K to C in the following way. Let t = (s, s') be a transition in K. For each
segment $S_i$ such that $s \in [l_i, r_i]$ we have the following transitions in C:

• If $s' \in [l_i, r_i]$ there is a light transition from $(i, s - l_i)$ to $(i, s' - l_i)$.

• If $s' \notin [l_i, r_i]$ there is a heavy transition from $(i, s - l_i)$ to $(i', s' - l_{i'})$, where either $S_{i'}$ is the unique
segment containing s' or, if two segments contain s', then $S_{i'}$ is the segment such that $s' \in [l_{i'}, l_{i'} + r/2 - 1]$,
i.e., the segment containing s' in the leftmost half.

If t is a forward transition with label $\alpha \in \Sigma$ it is also a forward transition in C with label $\alpha$, if t is a failure
transition it is also a failure transition in C, and if t is the accepting transition it is also an accepting transition
in C. The segment automaton with r = 4 corresponding to the KMP-automaton of Fig. 1(a) is shown in
Fig. 1(b) and (c) showing the light and heavy transitions, respectively. From the correspondence between C
and K we have that each accepting transition in a simulation of C on Q corresponds to an occurrence of P
in Q. Hence, we can solve string matching by simulating C instead of K.
We will use the following key property of C.
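
The copying rule can be made concrete with the following Python sketch, which enumerates the light and heavy transitions of C(P, r). All function names are assumptions of this sketch, and the "leftmost half" of a segment is taken to be its first r/2 states.

```python
def failure_function(P):
    """fail[s] for states 1..m; fail[0] = -1 is an unused sentinel."""
    m = len(P)
    fail = [-1] + [0] * m
    k = 0
    for s in range(2, m + 1):
        while k > 0 and P[k] != P[s - 1]:
            k = fail[k]
        if P[k] == P[s - 1]:
            k += 1
        fail[s] = k
    return fail


def segments(m, r):
    """Segments (l_i, r_i) with l_i = i * r/2 and r_i = min(l_i + r - 1, m)."""
    segs, i = [], 0
    while not segs or segs[-1][1] < m:
        l = i * (r // 2)
        segs.append((l, min(l + r - 1, m)))
        i += 1
    return segs


def target_segment(segs, r, s):
    """Segment entered by a heavy transition to K-state s: the unique segment
    containing s, or the one that contains s in its leftmost half."""
    containing = [i for i, (l, rt) in enumerate(segs) if l <= s <= rt]
    return next(i for i in containing
                if len(containing) == 1 or s < segs[i][0] + r // 2)


def segment_transitions(P, r):
    """Return (light, heavy) lists of ((i, j), (i2, j2), kind) triples."""
    m, fail = len(P), failure_function(P)
    segs = segments(m, r)
    light, heavy = [], []
    for i, (l, rt) in enumerate(segs):
        for s in range(l, rt + 1):
            targets = []
            if s < m:
                targets.append(("forward", s + 1))
            if s > 0:
                targets.append(("failure", fail[s]))
            for kind, s2 in targets:
                if l <= s2 <= rt:               # target stays inside S_i
                    light.append(((i, s - l), (i, s2 - l), kind))
                else:                           # target leaves S_i
                    i2 = target_segment(segs, r, s2)
                    heavy.append(((i, s - l), (i2, s2 - segs[i2][0]), kind))
    return light, heavy


light, heavy = segment_transitions("ababca", 4)
```

For P = ababca and r = 4, the forward transition out of state (0, 3) is heavy and enters state (2, 0), so a heavy forward transition can raise the segment number by 2.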

Lemma 2 For a string P of length m and even integer parameter 1 < r < m + 1, the simulation of the
segment automaton, C(P, r), on a string Q of length n contains at most $O(n/r + \mathrm{occ})$ heavy and accepting
transitions.

Proof. Consider the sequence T of transitions in the simulation of C = C(P, r) on Q. Let $N_{\mathrm{accept}}$ denote the
number of accepting transitions, and let $N_{\mathrm{hforward}}$ and $N_{\mathrm{hfail}}$ denote the number of heavy forward and heavy
failure transitions, respectively. Each accepting transition in T corresponds to an occurrence and therefore
$N_{\mathrm{accept}} = \mathrm{occ}$. For a state (i, j) in C we will refer to i as the segment number. Since a forward transition in
K increases the state number by 1 in K a heavy forward transition increases the segment number by 1 or 2
in C. A heavy failure transition strictly decreases the segment number. Hence, since we start the simulation
in segment 0, we can have at most 2 heavy failure transitions for each heavy forward transition in T and
therefore

$$N_{\mathrm{hfail}} \le 2 N_{\mathrm{hforward}}. \qquad (1)$$

If $N_{\mathrm{hforward}} = 0$ the result trivially follows. Hence, suppose that $N_{\mathrm{hforward}} > 0$. Before the first heavy
forward transition in T there must be at least r - 1 light transitions in order to reach state (0, r - 1). Consider
the subsequence of transitions t in T between an arbitrary heavy transition h and a forward heavy transition
f. The heavy transition h cannot end in segment z - 1 since there is no heavy forward transition from here.
All other heavy transitions have an endpoint in the leftmost half of a segment and therefore at least r/2
light transitions are needed before a heavy forward transition can occur. Recall that the total number of
transitions in T is at most 2n and therefore the number of heavy forward transitions in T is bounded by

$$N_{\mathrm{hforward}} \le \frac{2n}{r/2} = \frac{4n}{r}. \qquad (2)$$
Combining the bound on $N_{\mathrm{accept}}$ with (1) and (2) we have that the total number of heavy and accepting
transitions is

$$N_{\mathrm{hforward}} + N_{\mathrm{hfail}} + N_{\mathrm{accept}} \le 3 N_{\mathrm{hforward}} + \mathrm{occ} = O(n/r + \mathrm{occ}).$$

□



4 Representing Segments 
4.1 A Compact Encoding 

Let S be a segment with r states over an alphabet of size $\sigma$. We show how to compactly represent all light
transitions in S using $O(r \log \sigma)$ bits. To represent forward transitions we simply store the labels of the r - 1
light forward transitions in S using $(r - 1) \log \sigma = O(r \log \sigma)$ bits. Next consider the failure transitions. A
straightforward approach is to explicitly store for each state $s \in S$ a bit indicating if its failure pointer is
light or heavy and, if it is light, a pointer to fail(s). Each pointer requires $\lceil \log r \rceil$ bits and hence the total
cost for this representation is $O(r \log r)$ bits. We show how to improve this to $O(r)$ bits in the following.

First, we locally enumerate the states in S to [0, r - 1]. Let $I = \{i_1, \ldots, i_\ell\}$, $0 \le i_1 < \cdots < i_\ell < r$, be
the set of states in S with a light failure transition and let $F = \{f_{i_1}, \ldots, f_{i_\ell}\}$ be the set of failure pointers
for the states in I. We encode I as a bit string $B_I$ of length r such that $B_I[j] = 1$ iff $j \in I$. This uses r
bits. To represent F compactly we encode $f_{i_1}$ and the sequence of differences between consecutive elements
$D = d_{i_2}, \ldots, d_{i_\ell}$, where $d_{i_j} = f_{i_j} - f_{i_{j-1}}$. We represent $f_{i_1}$ explicitly using $\lceil \log r \rceil$ bits. Our representation
of D consists of 2 bit strings. The first string, denoted $B_D$, is the concatenation of the binary encoding of
the numbers in D, i.e., $B_D = \mathrm{bin}(d_{i_2}) \cdots \mathrm{bin}(d_{i_\ell})$, where $\mathrm{bin}(\cdot)$ denotes standard two's complement binary
encoding (the differences may be negative) and $\cdot$ denotes concatenation. Each number $d_j$ uses at most
$1 + \lceil \log |d_j| \rceil$ bits and therefore the size of $B_D$ is at most

$$|B_D| \le \sum_{j \in I'} \left(\lceil \log |d_j| \rceil + 1\right) \le r + \sum_{j \in I'} \lceil \log |d_j| \rceil, \qquad (3)$$

where $I' = \{i_2, \ldots, i_\ell\}$. The second bit string, denoted $B_{D'}$, represents the boundaries of the numbers in $B_D$,
i.e., $B_{D'}[k] = 1$ iff k is the start of a new number in $B_D$. Thus, $|B_{D'}| = |B_D|$. Note that with $f_{i_1}$, $B_D$, and
$B_{D'}$ we can uniquely decode F. The total size of the representation is $\lceil \log r \rceil + 2|B_D|$ bits.

To bound the size of the representation we show that $|B_D| = O(r)$ implying that the representation uses
$\lceil \log r \rceil + 2 \cdot O(r) = O(r)$ bits as desired. We first bound the sum $\sum_{j \in I'} |d_j|$. Recall from Lemma 1 that
the failure function increases by at most 1 between consecutive states in K. Hence, over the subsequence F
of $\le r$ failure pointers in the range [0, r - 1] the total increase of the failure function can be at most r.
Hence, $\sum_{j \in I',\, d_j > 0} d_j \le r$. Furthermore, if $f_{i_1} = x$, for some $x \in [0, r - 1]$, the total decrease of F over a segment
of r states is at most x plus the total increase and therefore $\sum_{j \in I',\, d_j < 0} d_j \ge -(x + r) \ge -2r$. Hence,

$$\sum_{j \in I'} |d_j| \le 2r. \qquad (4)$$

Combining (3) and (4) we have that

$$|B_D| \le r + \sum_{j \in I'} \lceil \log |d_j| \rceil \le r + \log\left((2r/r)^r\right) = O(r).$$

Thus, we have shown the following result.

Lemma 3 All light forward and failure transitions of a segment of size r can be encoded using $O(r \log \sigma)$
bits.
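
The encoding of the light failure pointers can be sketched in Python as follows. Bit strings are modelled as lists of 0/1 values for readability rather than packed into words, the function names are assumptions of this sketch, and each difference is written in the shortest two's complement width that can hold it, which is one possible reading of bin(·). The example dictionary is a hypothetical segment with r = 8 local states.

```python
def twos_complement_bits(d):
    """Shortest two's complement encoding of d, most significant bit first."""
    width = (d if d >= 0 else ~d).bit_length() + 1
    value = d & ((1 << width) - 1)
    return [(value >> (width - 1 - b)) & 1 for b in range(width)]


def encode_failure(light_fail, r):
    """light_fail maps a local state j in [0, r - 1] with a light failure
    transition to its local failure pointer. Returns (B_I, f_first, B_D, B_Dp),
    where B_Dp marks the first bit of every number stored in B_D."""
    I = sorted(light_fail)
    B_I = [1 if j in light_fail else 0 for j in range(r)]
    if not I:
        return B_I, None, [], []
    B_D, B_Dp = [], []
    for prev, cur in zip(I, I[1:]):
        bits = twos_complement_bits(light_fail[cur] - light_fail[prev])
        B_D += bits
        B_Dp += [1] + [0] * (len(bits) - 1)
    return B_I, light_fail[I[0]], B_D, B_Dp


def decode_failure(B_I, f_first, B_D, B_Dp):
    """Inverse of encode_failure."""
    I = [j for j, bit in enumerate(B_I) if bit]
    if not I:
        return {}
    starts = [k for k, bit in enumerate(B_Dp) if bit] + [len(B_D)]
    values = [f_first]
    for a, b in zip(starts, starts[1:]):
        v = 0
        for bit in B_D[a:b]:
            v = (v << 1) | bit
        if B_D[a] == 1:                 # sign bit set: negative difference
            v -= 1 << (b - a)
        values.append(values[-1] + v)
    return dict(zip(I, values))


# Hypothetical segment with r = 8: state 0 has no light failure transition.
example = {1: 0, 2: 1, 3: 0, 4: 1, 5: 2, 6: 3, 7: 4}
encoded = encode_failure(example, 8)
decoded = decode_failure(*encoded)
```

In this example the six differences occupy 11 bits in total, compared with 7 explicit pointers of $\lceil \log 8 \rceil = 3$ bits each.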
4.2 Simulating Light Transitions

Let C = C(P, r) be the segment automaton, and consider the path p of states in the simulation on C from
a state (i, j) on some string q. Then, the longest light path from (i, j) on q is defined as the longest prefix
of p consisting entirely of light non-accepting transitions in segment i. For example, consider state (1, 1) in
segment 1 in Fig. 1. The longest light path on the string q = bac is the path p = (1, 1), (1, 2), (1, 0), (1, 1).
The transition on c from state (1, 1) is heavy and therefore not included in p.

We show how to quickly compute the length and endpoint of longest light paths. Let $S_{\mathrm{enc}}$ be the compact
encoding of a segment S as described above including the label of the forward heavy transition from the
rightmost state in S (if any) and a bit indicating whether or not the rightmost light transition is accepting.
Furthermore, let j be a state in S, let q be a string, and define

Next($S_{\mathrm{enc}}$, j, q): Return the pair (l, j'), where l and j' are the length and final state, respectively, of the
longest light path in S from j matching a prefix of q.

We can efficiently tabulate Next for arbitrary strings q of length r as follows. Let b be the total number
of bits needed to represent the input to Next. The string q uses $r \lceil \log \sigma \rceil$ bits and by Lemma 3 $S_{\mathrm{enc}}$
uses $O(r \log \sigma + \log \sigma + 1) = O(r \log \sigma)$ bits. Furthermore, the state number j uses $\lceil \log r \rceil$ bits and hence
$b = O(r \log \sigma + \log r) = O(r \log \sigma)$. Using a table T with $2^b$ entries we can store all results of Next.
Each entry is computed using a standard simulation in $O(r)$ time and therefore we can construct T in
$2^b \cdot O(r) = 2^{O(b)}$ time and space. Hence, if we have $t < 2^w$ space available for T we may set
$r = \frac{1}{c} \cdot \frac{\log t}{\log \sigma}$, where c > 0 is an upper bound on the constant appearing in the $2^{O(b)}$ expression above. Hence, the total
space and preprocessing time now becomes $2^{O(b)} \le 2^{c \cdot r \log \sigma} = 2^{\log t} = O(t)$.

With T precomputed and stored in memory we can now answer arbitrary Next queries for arbitrary
encoded segments and strings of length at most r in constant time.
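
The "standard simulation" that fills one table entry can be sketched in Python as below. For simplicity the sketch works directly on P and the failure function instead of the compact encoding, and it takes the length l to be the number of characters of q consumed along the light path; both choices, and the name `next_light`, are assumptions of this sketch rather than the paper's exact definition.

```python
def failure_function(P):
    """fail[s] for states 1..m; fail[0] = -1 is an unused sentinel."""
    m = len(P)
    fail = [-1] + [0] * m
    k = 0
    for s in range(2, m + 1):
        while k > 0 and P[k] != P[s - 1]:
            k = fail[k]
        if P[k] == P[s - 1]:
            k += 1
        fail[s] = k
    return fail


def next_light(P, fail, seg, j, q):
    """Follow light non-accepting transitions inside segment seg = (l_i, r_i),
    starting in local state j, on the characters of q. Returns (l, j2): the
    number of characters consumed and the final local state."""
    (l_i, r_i), m = seg, len(P)
    s, consumed = l_i + j, 0
    while consumed < len(q):
        a = q[consumed]
        if s < m and P[s] == a:               # forward transition matches a
            if s + 1 > r_i or s + 1 == m:     # heavy or accepting: stop
                break
            s, consumed = s + 1, consumed + 1
        elif s == 0:                          # mismatch in state 0: skip a
            consumed += 1
        elif fail[s] < l_i:                   # heavy failure transition: stop
            break
        else:
            s = fail[s]                       # light failure transition
    return consumed, s - l_i


P = "ababca"
fail = failure_function(P)
# Segment 1 of Fig. 1 is the interval [2, 5]; start in state (1, 1), q = bac.
result = next_light(P, fail, (2, 5), 1, "bac")
```

On the example from the text the sketch follows (1, 1), (1, 2), (1, 0), (1, 1), consuming the two characters b and a, and stops before the heavy failure transition on c.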

5 The Algorithm 

We now put the pieces from the previous sections together to obtain our main result of Theorem 1. Assume
that we have $t < 2^w$ space available and choose $r = \Theta(\log t/\log \sigma)$ as above for the tabulation. We first
preprocess P by computing the following information:

• The segment automaton C(P, r) with parameter r and $z = \lceil (m + 1)/r \rceil$ segments $\mathcal{SS} = \{S_0, \ldots, S_{z-1}\}$.

• The compact encoding $S_{\mathrm{enc}}$ for each segment $S \in \mathcal{SS}$.

• The tabulated Next function for segments with r states and input string of length r.

We compute the segment automaton and the compact encodings in $O(m)$ time and space. The tabulation
for Next uses $O(t)$ time and space and hence the preprocessing uses $O(t + m)$ time and space.

We find the occurrences of P in Q using the algorithm described below. The main idea is to simulate the
segment automaton using the tabulated Next function. At each iteration of
the algorithm we traverse light transitions until we either have processed r characters from Q or encounter
a heavy or accepting transition. We then follow the next transition, reporting an occurrence if the transition
is accepting, and repeat until we have read all of Q.

Algorithm S (Packed String Search). Let P be a string preprocessed for parameter r as above. Given
a string Q of length n this algorithm finds all occurrences of P in Q.

S1. [Initialize] Set (i, j) ← (0, 0) and k ← 1.

S2. [Do light transitions] Compute (l, j') ← Next($S^i_{\mathrm{enc}}$, j, Q[k, min(k + r, n)]). At this point (i, j') is the
state in the traversal of C on Q after reading the prefix Q[1, k + l]. All transitions on the string
Q[k, k + l] are light and non-accepting by the definition of Next.

S3. [Done?] If k = n the algorithm terminates.

S4. [Do next transition] Compute (i*, j*) by following the transition from (i, j') on character Q[k + l + 1].
If the transition is a failure transition we set k* ← k + l and otherwise set k* ← k + l + 1. If this is the
accepting transition report an occurrence ending at position k*.

S5. [Repeat] Update (i, j) ← (i*, j*) and k ← k* and repeat from step S2.
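
The control flow of Algorithm S can be illustrated with the following Python sketch. It is not the constant-time-per-iteration algorithm of the paper: Q is an ordinary unpacked string, Next is recomputed by direct simulation instead of being looked up in the table T, and the variable k counts the characters of Q consumed so far (a 0-indexed convention chosen for this sketch). All function names are assumptions.

```python
def failure_function(P):
    """fail[s] for states 1..m; fail[0] = -1 is an unused sentinel."""
    m = len(P)
    fail = [-1] + [0] * m
    k = 0
    for s in range(2, m + 1):
        while k > 0 and P[k] != P[s - 1]:
            k = fail[k]
        if P[k] == P[s - 1]:
            k += 1
        fail[s] = k
    return fail


def segments(m, r):
    """Segments (l_i, r_i) with l_i = i * r/2 and r_i = min(l_i + r - 1, m)."""
    segs, i = [], 0
    while not segs or segs[-1][1] < m:
        l = i * (r // 2)
        segs.append((l, min(l + r - 1, m)))
        i += 1
    return segs


def target_segment(segs, r, s):
    """Segment entered by a heavy transition to K-state s (leftmost half)."""
    containing = [i for i, (l, rt) in enumerate(segs) if l <= s <= rt]
    return next(i for i in containing
                if len(containing) == 1 or s < segs[i][0] + r // 2)


def next_light(P, fail, seg, j, q):
    """Characters of q consumed on light non-accepting transitions inside
    seg, and the final local state (stands in for a lookup in table T)."""
    (l_i, r_i), m = seg, len(P)
    s, consumed = l_i + j, 0
    while consumed < len(q):
        a = q[consumed]
        if s < m and P[s] == a:
            if s + 1 > r_i or s + 1 == m:     # heavy or accepting: stop
                break
            s, consumed = s + 1, consumed + 1
        elif s == 0:
            consumed += 1
        elif fail[s] < l_i:                   # heavy failure: stop
            break
        else:
            s = fail[s]
    return consumed, s - l_i


def packed_search(P, Q, r):
    """Simulate C(P, r) on Q. Returns the 1-indexed end positions of all
    occurrences and the number of iterations of the main loop."""
    m, fail = len(P), failure_function(P)
    segs = segments(m, r)
    i, j, k = 0, 0, 0                         # S1; k = characters consumed
    occurrences, iterations = [], 0
    while k < len(Q):
        iterations += 1
        l, j = next_light(P, fail, segs[i], j, Q[k:k + r])    # S2
        k += l
        if k == len(Q):                                        # S3
            break
        s = segs[i][0] + j                                     # S4
        if s < m and P[s] == Q[k]:            # forward transition
            s, k = s + 1, k + 1
            if s == m:                        # accepting transition
                occurrences.append(k)
        elif s == 0:                          # no transition from state 0
            k += 1
        else:                                 # failure transition
            s = fail[s]
        if not segs[i][0] <= s <= segs[i][1]:                  # S5
            i = target_segment(segs, r, s)    # the transition was heavy
        j = s - segs[i][0]
    return occurrences, iterations


found, iterations = packed_search("ababca", "abababcaababca", 4)
```

On this input the sketch reports the occurrences ending at positions 8 and 14 after 6 iterations of the main loop.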

It is straightforward to verify that Algorithm S simulates C(P, r) on Q and reports occurrences whenever
we encounter an accepting transition. In each iteration we either read r characters from Q and/or perform a
heavy or accepting transition. We can process r characters from Q on light transitions at most $\lceil n/r \rceil$ times and
by Lemma 2 the total number of heavy and accepting transitions is $O(n/r + \mathrm{occ})$. Hence, the total number
of iterations is $O(n/r + \mathrm{occ})$. Since each iteration takes constant time this also bounds the running time.
Adding the preprocessing time and plugging in $r = \Theta(\log t/\log \sigma)$ the time becomes

$$O\left(\frac{n}{r} + t + m + \mathrm{occ}\right) = O\left(\frac{n}{\log_\sigma t} + t + m + \mathrm{occ}\right)$$

with space $O(t + m)$. Hence we have the following result.

Theorem 2 Let P and Q be packed strings of length m and n, respectively. For a parameter $t < 2^w$ we can
solve the packed string matching problem in time $O\left(\frac{n}{\log_\sigma t} + t + m + \mathrm{occ}\right)$ and space $O(t + m)$.

Note that the tabulation is independent of P and we therefore only need to compute it once for multiple
searches. If we plug in $t = n^\epsilon$, for $0 < \epsilon < 1$, we obtain an algorithm using time
$O\left(\frac{n}{\log_\sigma n^\epsilon} + n^\epsilon + m + \mathrm{occ}\right) = O\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right)$ and space $O(n^\epsilon + m)$, thereby showing Theorem 1.
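
For a constant $\epsilon$ and $\sigma \ge 2$, the simplification obtained from the choice $t = n^\epsilon$ can be spelled out as follows; this only expands the substitution and adds no new result.

```latex
\begin{align*}
\log_\sigma n^{\epsilon} = \epsilon \log_\sigma n
  \quad &\Longrightarrow \quad
  \frac{n}{\log_\sigma n^{\epsilon}}
  = \frac{1}{\epsilon} \cdot \frac{n}{\log_\sigma n}
  = O\!\left(\frac{n}{\log_\sigma n}\right), \\
\log_\sigma n \le \log_2 n = O\!\left(n^{1-\epsilon}\right)
  \quad &\Longrightarrow \quad
  n^{\epsilon} = O\!\left(\frac{n}{\log_\sigma n}\right).
\end{align*}
```

Both the $n/\log_\sigma t$ term and the additive $t$ term of Theorem 2 are therefore absorbed into $O(n/\log_\sigma n)$.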

6 Remarks and Open Problems 

We have presented an almost optimal solution for the packed string matching problem on a unit-cost RAM
with logarithmic word length. We conclude with two challenging open problems.

• Our algorithm relies on tabulation to compute the Next function, and therefore its speed is limited by
the amount of space we have for tables. Consequently, it cannot take advantage of a large word length
$w \gg \log n$. We wonder if it is possible to obtain a packed string matching algorithm that achieves
a speed-up over the KMP-algorithm that depends on w rather than $\log n$. In particular, it might
be possible to come up with an algorithm based on word-level parallelism (a.k.a. bit parallelism [7]),
that uses the arithmetic and logical instructions in the word-RAM instead of tables to perform the
computation.

• It would be interesting to obtain fast algorithms for related packed problems. For instance, we wonder if
it is possible to obtain a similar speed-up for the multi-string matching problem [1]. The Aho-Corasick
algorithm [1] for multi-string matching uses an automaton that generalizes the KMP-automaton;
however, it appears difficult to extend our techniques to this automaton.